# Encoder-Decoder Architecture
MrT5 Large
MrT5 is an efficient byte-level language model that builds on ByT5, reducing input sequence length by approximately 50% through dynamic token merging.
Large Language Model
Transformers Supports Multiple Languages

stanfordnlp
33
2
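The core idea behind MrT5's dynamic token merging can be sketched in a few lines. This is a deliberately simplified stand-in: MrT5 actually *learns* which byte tokens to delete with a gating mechanism after an early encoder layer, whereas the toy function below just mean-pools adjacent byte embeddings at a fixed rate to roughly halve the sequence the encoder has to process.

```python
from typing import List

def merge_pairs(embeddings: List[List[float]]) -> List[List[float]]:
    """Mean-pool adjacent embedding pairs, roughly halving sequence length.

    A fixed-rate toy version of dynamic token merging; the real model
    decides per-token, per-input which positions to drop.
    """
    merged = []
    for i in range(0, len(embeddings) - 1, 2):
        a, b = embeddings[i], embeddings[i + 1]
        merged.append([(x + y) / 2 for x, y in zip(a, b)])
    if len(embeddings) % 2:  # keep a trailing odd token as-is
        merged.append(embeddings[-1])
    return merged

# Five byte-level "embeddings" of dimension 2 shrink to three positions.
seq = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 0.0], [5.0, 5.0]]
print(merge_pairs(seq))  # [[2.0, 1.0], [1.0, 2.0], [5.0, 5.0]]
```

The payoff is that every downstream encoder layer runs over half as many positions, which is where the efficiency gain over plain ByT5 comes from.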
Shuka v1
Shuka v1 is a language model with native audio understanding for Indian languages, combining an in-house audio encoder with the Llama3-8B-Instruct decoder to enable zero-shot multilingual question answering.
Audio-to-Text
Transformers Supports Multiple Languages

sarvamai
729
54
Pile-T5 Base
Pile-T5 Base is an encoder-decoder model trained on The Pile dataset using the T5x library, with an MLM objective for 2 million steps (approximately 2 trillion tokens).
Large Language Model
Transformers English

EleutherAI
50
19
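The MLM objective mentioned for the Pile-T5 models is T5-style span corruption: contiguous spans of the input are replaced by sentinel tokens, and the target reconstructs each span after its sentinel. The sketch below is a minimal illustration of that input/target format; span positions are hard-coded for clarity, whereas real pretraining samples them randomly.

```python
def span_corrupt(tokens, spans):
    """Build a T5-style (input, target) pair.

    spans: list of (start, end) half-open index ranges to mask.
    Each masked span is replaced in the input by a sentinel token,
    and the target lists each sentinel followed by the span it hid.
    """
    inp, tgt = [], []
    cursor = 0
    for k, (s, e) in enumerate(spans):
        sentinel = f"<extra_id_{k}>"
        inp.extend(tokens[cursor:s])
        inp.append(sentinel)
        tgt.append(sentinel)
        tgt.extend(tokens[s:e])
        cursor = e
    inp.extend(tokens[cursor:])
    return inp, tgt

tokens = "the pile is a large diverse text corpus".split()
inp, tgt = span_corrupt(tokens, [(1, 2), (4, 6)])
print(inp)  # ['the', '<extra_id_0>', 'is', 'a', '<extra_id_1>', 'text', 'corpus']
print(tgt)  # ['<extra_id_0>', 'pile', '<extra_id_1>', 'large', 'diverse']
```

The `<extra_id_N>` sentinel naming follows the convention used by T5 tokenizers; everything else here is illustrative.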
Pile-T5 XXL
Pile-T5 XXL is an encoder-decoder model trained on The Pile dataset using the T5x library, employing an MLM objective similar to the original T5 model, trained for 2 million steps (approximately 2 trillion tokens).
Large Language Model
Transformers English

EleutherAI
44
28
MedICap
Apache-2.0
MedICap is an encoder-decoder model for medical image captioning; it won the ImageCLEFmedical Caption 2023 challenge.
Image-to-Text
Transformers

aehrc
475
2
vlT5 Base Keywords
An encoder-decoder keyword-generation model based on Google's Transformer architecture, supporting Polish and English; it is used primarily to extract keywords from scientific-paper abstracts.
Text Generation
Transformers Supports Multiple Languages

Voicelab
6,736
55
Bert Mini2bert Mini Finetuned Cnn Daily Mail Summarization
Apache-2.0
This is an encoder-decoder model based on the BERT-mini architecture, fine-tuned on the CNN/DailyMail dataset for text summarization.
Text Generation
Transformers English

mrm8488
140
5
Wav2vec2 Large Xlsr 53 German Gpt2
Apache-2.0
This is an automatic speech recognition encoder-decoder model trained on the German split of the mozilla-foundation/common_voice_7_0 dataset, combining the strengths of the Wav2Vec2 and GPT2 architectures.
Speech Recognition
Transformers German

jsnfly
28
2
Bert2bert Turkish Paraphrase Generation
A Turkish paraphrase-generation model based on the Bert2Bert architecture, used to generate sentences with the same meaning but different wording.
Text Generation
Transformers Other

ahmetbagci
118
12
RoBERTa2RoBERTa L-24 CNN/DailyMail
Apache-2.0
An encoder-decoder model initialized with RoBERTa-Large, specifically designed for summarization tasks and fine-tuned on the CNN/DailyMail dataset.
Text Generation
Transformers English

google
128
6
Encoder Decoder Es
An encoder-decoder model fine-tuned on the cc_news_es_titles dataset for Spanish text-processing tasks.
Large Language Model
Transformers

amazon-sagemaker-community
121
0